In this work, we propose a novel image reconstruction framework that directly learns a neural implicit representation in k-space for ECG-triggered non-Cartesian Cardiac Magnetic Resonance Imaging (CMR). While existing methods bin acquired data from neighboring time points to reconstruct one phase of the cardiac motion, our framework allows for a continuous, binning-free, and subject-specific k-space representation. We assign a unique coordinate that consists of time, coil index, and frequency domain location to each sampled k-space point. We then learn the subject-specific mapping from these unique coordinates to k-space intensities using a multi-layer perceptron with frequency domain regularization. During inference, we obtain a complete k-space for Cartesian coordinates at an arbitrary temporal resolution. A simple inverse Fourier transform recovers the image, eliminating the need for density compensation and costly non-uniform Fourier transforms for non-Cartesian data. This novel imaging framework was tested on 42 radially sampled datasets from 6 subjects. The proposed method outperforms other techniques qualitatively and quantitatively when reconstructing 30 cardiac phases from four heartbeats and from a single heartbeat. Our results for one-heartbeat reconstruction of 50 cardiac phases show improved artifact removal and spatio-temporal resolution, demonstrating the potential for real-time CMR.
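The inference step described above (query a full Cartesian k-space, then apply an inverse Fourier transform) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the normalized frequency range of [-0.5, 0.5), and the root-sum-of-squares coil combination are all hypothetical choices that the abstract does not specify, and the trained multi-layer perceptron is left out entirely.

```python
import numpy as np

def cartesian_query_coords(t, n_coils, n):
    """Build (time, coil, kx, ky) query coordinates for a full n x n Cartesian grid.

    Assumption: frequencies are normalized to [-0.5, 0.5); the abstract does not
    state the normalization used in the original work.
    """
    k = np.linspace(-0.5, 0.5, n, endpoint=False)
    coil, kx, ky = np.meshgrid(np.arange(n_coils), k, k, indexing="ij")
    tt = np.full_like(kx, t, dtype=float)
    coords = np.stack([tt, coil.astype(float), kx, ky], axis=-1)
    return coords.reshape(-1, 4)

def kspace_to_image(kspace):
    """Centered 2D inverse FFT per coil, then root-sum-of-squares over coils.

    kspace has shape (n_coils, n, n). The coil combination is an assumption.
    """
    shifted = np.fft.ifftshift(kspace, axes=(-2, -1))
    coil_images = np.fft.fftshift(np.fft.ifft2(shifted), axes=(-2, -1))
    return np.sqrt((np.abs(coil_images) ** 2).sum(axis=0))

# Usage: a trained network would map the query coordinates to k-space values.
# Here a synthetic single-coil phantom stands in for the network output.
phantom = np.zeros((1, 8, 8))
phantom[0, 2:6, 2:6] = 1.0
kspace = np.fft.fftshift(
    np.fft.fft2(np.fft.ifftshift(phantom, axes=(-2, -1))), axes=(-2, -1)
)
coords = cartesian_query_coords(t=0.1, n_coils=1, n=8)
image = kspace_to_image(kspace)
```

Because the queried grid is Cartesian, a plain FFT suffices and no density compensation or non-uniform transform is needed, which is the point the abstract makes.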
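Recent efforts on multimodal Transformers have improved Visually Rich Document Understanding (VRDU) tasks by incorporating visual and textual information. However, existing approaches mainly focus on fine-grained elements such as words and document image patches, which makes it hard for them to learn from coarse-grained elements, including natural lexical units such as phrases and salient visual regions such as prominent image regions. In this paper, we attach more importance to coarse-grained elements, which contain high-density information and consistent semantics and are therefore valuable for document understanding. First, a document graph is proposed to model the complex relationships among multi-grained multimodal elements, in which salient visual regions are detected by a cluster-based method. Then, a multi-grained multimodal Transformer called mmLayout is proposed to incorporate coarse-grained information into existing pre-trained fine-grained multimodal Transformers based on the graph. In mmLayout, coarse-grained information is aggregated from fine-grained information and, after further processing, is fused back into the fine-grained representation for final prediction. Furthermore, common sense enhancement is introduced to exploit the semantic information of natural lexical units. Experimental results on four tasks, including information extraction and document question answering, show that our method can improve the performance of multimodal Transformers based on fine-grained elements and achieve better performance with fewer parameters. Qualitative analyses show that our method can capture consistent semantics in coarse-grained elements.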
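Self-supervised learning methods (especially contrastive learning) on heterogeneous graphs can effectively remove the dependence on supervisory data. Meanwhile, most existing representation learning methods embed heterogeneous graphs into a single geometric space, either Euclidean or hyperbolic. Such a single geometric view is usually not enough to observe the complete picture of heterogeneous graphs, owing to their rich semantics and complex structures. Based on these observations, this paper proposes a novel self-supervised learning method, termed Geometry Contrastive Learning (GCL), to better represent heterogeneous graphs when supervisory data is unavailable. GCL views a heterogeneous graph from the Euclidean and hyperbolic perspectives simultaneously, aiming to strongly merge the abilities to model rich semantics and complex structures, which is expected to bring more benefits for downstream tasks. GCL maximizes the mutual information between the two geometric views by contrasting representations at both the local-local and local-global semantic levels. Extensive experiments on four benchmark datasets show that the proposed approach outperforms strong baselines, including both unsupervised and supervised methods, on three tasks: node classification, node clustering, and similarity search.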
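Contrasting two views of the same nodes is commonly implemented with an InfoNCE-style objective, which is a lower bound on mutual information. The sketch below illustrates that general idea only. It is an assumption that an InfoNCE loss is a reasonable stand-in here: the abstract does not name the estimator GCL uses, and the function name, the cosine similarity, and the temperature value are hypothetical.

```python
import numpy as np

def info_nce(z_a, z_b, temperature=0.5):
    """InfoNCE loss between two views; row i of z_a and row i of z_b are positives.

    z_a, z_b: arrays of shape (n_nodes, dim), e.g. embeddings of the same nodes
    from two different geometric views (an illustrative assumption).
    """
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = z_a @ z_b.T / temperature                  # cosine similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))

# Usage: aligned views should give a lower loss than misaligned ones.
rng = np.random.default_rng(0)
view_a = rng.standard_normal((8, 16))
loss_aligned = info_nce(view_a, view_a)
loss_shifted = info_nce(view_a, np.roll(view_a, 1, axis=0))
```

Minimizing this loss pulls the two representations of each node together while pushing apart representations of different nodes, which is the mechanism by which a contrastive objective increases agreement between views.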
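Differentiable Architecture Search (DARTS) has received massive attention in recent years, mainly because it significantly reduces the computational cost through weight sharing and continuous relaxation. However, more recent works find that existing differentiable NAS techniques struggle to outperform naive baselines, yielding deteriorating architectures as the search proceeds. Rather than directly optimizing the architecture parameters, this paper formulates neural architecture search as a distribution learning problem by relaxing the architecture weights into Gaussian distributions. By leveraging natural-gradient variational inference (NGVI), the architecture distribution can be easily optimized on top of existing codebases without incurring more memory or computational consumption. We demonstrate how differentiable NAS benefits from Bayesian principles, which enhance exploration and improve stability. Experimental results on the NAS-Bench-201 and NAS-Bench-1Shot1 benchmark datasets confirm the significant improvements the proposed framework can make. In addition, instead of simply applying argmax to the learned parameters, we further leverage recently proposed training-free proxies in NAS to select the optimal architecture from a group of architectures drawn from the optimized distribution, achieving state-of-the-art results on the NAS-Bench-201 and NAS-Bench-1Shot1 benchmarks. Our best architecture in the DARTS search space also obtains competitive test errors of 2.37%, 15.72%, and 24.2% on the CIFAR-10, CIFAR-100, and ImageNet datasets, respectively.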
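The final selection step described above (draw several architectures from a learned Gaussian distribution over architecture weights, then pick one with a training-free proxy) can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the paper's code: the function names are hypothetical, the per-edge argmax discretization is an assumption, the NGVI optimization that would produce `mu` and `sigma` is omitted, and the proxy shown in the usage lines is a toy placeholder, not a real training-free metric.

```python
import numpy as np

def sample_architectures(mu, sigma, n_samples, rng):
    """Draw architecture weights alpha ~ N(mu, sigma^2) and discretize each draw.

    mu, sigma: arrays of shape (n_edges, n_ops), the learned mean and standard
    deviation per candidate operation. Returns an integer array of shape
    (n_samples, n_edges) holding the chosen operation index on every edge.
    """
    eps = rng.standard_normal((n_samples,) + mu.shape)
    alpha = mu + sigma * eps           # reparameterized Gaussian sample
    return alpha.argmax(axis=-1)       # assumption: keep the strongest op per edge

def select_best(candidates, proxy_score):
    """Return the candidate architecture with the highest proxy score."""
    scores = [proxy_score(arch) for arch in candidates]
    return candidates[int(np.argmax(scores))]

# Usage with a toy proxy (hypothetical; a real proxy would score the network
# built from each architecture without training it).
rng = np.random.default_rng(0)
mu = np.zeros((6, 4))
sigma = np.ones((6, 4))
candidates = sample_architectures(mu, sigma, n_samples=10, rng=rng)
best = select_best(candidates, proxy_score=lambda arch: -float(arch.sum()))
```

Sampling from the distribution, instead of taking a single argmax over the means, is what gives the selection stage a pool of distinct candidates to rank.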
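While Differentiable Architecture Search (DARTS) has become the mainstream paradigm in Neural Architecture Search (NAS) due to its simplicity and efficiency, recent works have found that the performance of the searched architecture barely increases as the optimization in DARTS proceeds, and that the final magnitudes obtained by DARTS can hardly indicate the importance of operations. These observations suggest that the supervision signal in DARTS may be a poor or unreliable indicator for architecture search, and they motivate an interesting and promising direction: can we measure operation importance without any training under the differentiable paradigm? We provide an affirmative answer by formulating NAS as a network pruning-at-initialization problem. Leveraging recently proposed synaptic saliency criteria from network pruning at initialization, we score the importance of candidate operations in differentiable NAS without any training, and propose a novel framework called training-free differentiable architecture search (FreeDARTS). We show that, without any training, FreeDARTS with different proxy metrics can outperform most NAS baselines in different search spaces. More importantly, FreeDARTS is extremely memory-efficient and computationally efficient because it abandons training in the architecture search phase, which enables it to perform architecture search on a more flexible space and eliminates the depth gap between architecture search and evaluation. We hope our work inspires more attempts at solving NAS from the perspective of pruning at initialization.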
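With the development of deep learning technology, multispectral image super-resolution methods based on convolutional neural networks have recently achieved great progress. However, single hyperspectral image super-resolution remains a challenging problem because of the high-dimensional and complex spectral characteristics of hyperspectral data, which make it difficult to capture spatial and spectral information simultaneously. To deal with this issue, we propose a novel Feedback Refined Local-Global Network (FRLGN) for the super-resolution of hyperspectral images. Specifically, we develop a new feedback structure and a local-global spectral block to alleviate the difficulty of spatial and spectral feature extraction. The feedback structure can transfer high-level information to guide the generation of low-level features, and is realized by a recurrent structure with finite unfoldings. Furthermore, to make effective use of the high-level information that is passed back, a local-global spectral block is constructed to handle the feedback connections. The local-global spectral block uses the fed-back high-level information to correct the low-level features from local spectral bands and produces powerful high-level representations among global spectral bands. By incorporating the feedback structure and the local-global spectral block, FRLGN can fully exploit the spatial-spectral correlations among spectral bands and gradually reconstruct high-resolution hyperspectral images. The source code of FRLGN is available at https://github.com/tangzhenjie/frlgn.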
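Complex question answering over knowledge bases (Complex KBQA) is challenging because it requires various compositional reasoning capabilities, such as multi-hop inference, attribute comparison, and set operations. Existing benchmarks have some shortcomings that limit the development of Complex KBQA: 1) they only provide QA pairs without explicit reasoning processes; 2) their questions are poor in diversity or scale. To this end, we introduce KQA Pro, a dataset for Complex KBQA that includes ~120K diverse natural language questions. We introduce a compositional and interpretable programming language, KoPL, to represent the reasoning process of complex questions. For each question, we provide the corresponding KoPL program and SPARQL query, so that KQA Pro can serve both KBQA and semantic parsing tasks. Experimental results show that SOTA KBQA methods cannot achieve results on KQA Pro as promising as those on current datasets, which suggests that KQA Pro is challenging and that Complex KBQA requires further research effort. We also treat KQA Pro as a diagnostic dataset for testing multiple reasoning skills, conduct a thorough evaluation of existing models, and discuss further directions for Complex KBQA. Our code and datasets can be obtained from https://github.com/shijx12/kqapro_baselines.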
Point cloud analysis is very challenging, as the shape implied in irregular points is difficult to capture. In this paper, we propose RS-CNN, namely, Relation-Shape Convolutional Neural Network, which extends regular grid CNN to irregular configuration for point cloud analysis. The key to RS-CNN is learning from relation, i.e., the geometric topology constraint among points. Specifically, the convolutional weight for a local point set is forced to learn a high-level relation expression from predefined geometric priors between a sampled point from this point set and the others. In this way, an inductive local representation with explicit reasoning about the spatial layout of points can be obtained, which leads to strong shape awareness and robustness. With this convolution as a basic operator, RS-CNN, a hierarchical architecture, can be developed to achieve contextual shape-aware learning for point cloud analysis. Extensive experiments on challenging benchmarks across three tasks verify that RS-CNN achieves state-of-the-art performance.
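The idea of learning the convolutional weight from geometric relations, instead of storing it per grid position, can be sketched for a single local neighborhood as below. This is a minimal sketch under stated assumptions, not the RS-CNN implementation: the choice of relation vector (distance, offset, and both coordinates), the two-layer mapping with random weights, and the max-pooling aggregation are assumptions made for illustration, and all names are hypothetical.

```python
import numpy as np

def relation_features(center, neighbors):
    """Predefined geometric priors between a sampled point and its neighbors.

    center: (3,), neighbors: (k, 3). Returns (k, 10): Euclidean distance,
    coordinate offset, center coordinates, neighbor coordinates.
    """
    diff = center - neighbors
    dist = np.linalg.norm(diff, axis=1, keepdims=True)
    tiled_center = np.broadcast_to(center, neighbors.shape)
    return np.concatenate([dist, diff, tiled_center, neighbors], axis=1)

def relation_shape_conv(center, neighbors, feats, w1, w2):
    """One relation-based convolution over a local point set.

    feats: (k, c) neighbor features; w1: (10, hidden); w2: (hidden, c).
    The per-neighbor weight is produced from the relation vector by a small
    two-layer mapping, applied channel-wise, then aggregated symmetrically.
    """
    h = relation_features(center, neighbors)
    weights = np.maximum(h @ w1, 0.0) @ w2        # (k, c) learned from relations
    pooled = (weights * feats).max(axis=0)        # symmetric aggregation over neighbors
    return np.maximum(pooled, 0.0)                # nonlinearity

# Usage with random (untrained) parameters.
rng = np.random.default_rng(0)
center = rng.standard_normal(3)
neighbors = rng.standard_normal((5, 3))
feats = rng.standard_normal((5, 4))
w1 = rng.standard_normal((10, 8))
w2 = rng.standard_normal((8, 4))
out = relation_shape_conv(center, neighbors, feats, w1, w2)
```

Because the weights depend only on each neighbor's relation to the sampled point and the aggregation is symmetric, the output does not change when the neighbors are reordered, which is what makes the operator applicable to unordered point sets.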
Blind image quality assessment (BIQA) remains challenging due to the diversity of distortions and the variation in image content, which complicate distortion patterns across different scales and make the regression problem for BIQA harder. However, existing BIQA methods often fail to consider multi-scale distortion patterns and image content, and little research has been done on learning strategies that help the regression model perform better. In this paper, we propose a simple yet effective Progressive Multi-Task Image Quality Assessment (PMT-IQA) model, which contains a multi-scale feature extraction module (MS) and a progressive multi-task learning module (PMT), to help the model learn complex distortion patterns and to better optimize the regression problem in line with the easy-to-hard progression of human learning. To verify the effectiveness of the proposed PMT-IQA model, we conduct experiments on four widely used public datasets. The experimental results indicate that PMT-IQA outperforms the compared approaches and that both the MS and PMT modules improve the model's performance.
In this paper, we study the problem of knowledge-intensive text-to-SQL, in which domain knowledge is necessary to parse expert questions into SQL queries over domain-specific tables. We formalize this scenario by building a new Chinese benchmark KnowSQL consisting of domain-specific questions covering various domains. We then address this problem by presenting formulaic knowledge, rather than by annotating additional data examples. More concretely, we construct a formulaic knowledge bank as a domain knowledge base and propose a framework (ReGrouP) to leverage this formulaic knowledge during parsing. Experiments using ReGrouP demonstrate a significant 28.2% improvement overall on KnowSQL.